This program allows multiple ARexx programs to access a buffered version of
a directory in a consistent and serialized manner. This program and all
derivative works are copyright 1990 Darren New. All Rights Reserved.
Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee is hereby granted, provided
that the above copyright notice appear in all copies and that both that
copyright notice and this permission notice appear in supporting
documentation, and that the names of the copyright holder or author not be
used in advertising or publicity pertaining to distribution of the
software without specific, written prior permission.
BOTH THE AUTHOR AND THE COPYRIGHT HOLDER DISCLAIM ALL WARRANTIES WITH
REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL THE COPYRIGHT HOLDER OR THE
AUTHOR BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY
DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN
AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
If you use this program, especially if you redistribute it with your own
application, I request that you drop me a note at new@udel.edu (internet)
to make me feel good. Also, if you write a GREP function that may have more
general utility, please feel free to send it to me for inclusion in future
distributions of FAM.
VERSION 1.1 Feb 1990
-------------------------------------------------------------------------
In essence, FAM is freely distributable; see the .doc files for
distribution permissions.
This program does several things with the files in a directory.
One: it buffers all the names, dates, sizes, and so on, for quick access.
Two: it allows a program to "lock" a file, meaning that the file is marked
as locked in the buffer. This locking does not affect the file in any way.
The locking operation is a "test-and-set" type operation (not the TAS
instruction :-). This can be used to prevent two processes from attempting
to write the file at the same time.
Three: it buffers a count of the number of lines in each file if requested.
Files with specific extensions (.zoo, .arc, .lhz, ...) can be excluded from
this count.
Four: it allows files with digits in their names to be created atomically
based on the names of files already there. That is, if the directory
contains the files X01, X02, and X03, two programs can request new files
starting with X and one will get X04 and the other will get X05.
Five: it allows the first few lines of each file to be buffered in memory.
This allows for quick scans of file headers and such.
Six: it allows a user-compiled function to be used in finding files meeting
certain requirements. The included function, for example, returns the names
of files whose name starts with a given string and that contain a line
beginning with a given text and containing a given word: say, files whose
name starts with "comp.sys.amiga" with a line that starts with "Subject:"
and contains the word "FAM". The FAM source code is not needed to change
this function, and multiple copies running concurrently may use different
functions, each LoadSeg'ed at runtime. I believe that Manx functions can be
used with the Lattice executables, as no BLink linking is needed.
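The matching rule described in item Six can be sketched in plain C as
follows. This is only an illustration under stated assumptions: the name
fam_match and its parameters are hypothetical, the buffered lines are
assumed to be one newline-separated string, and "contains the word" is
treated as a simple substring test. The function shipped with FAM may
differ on all of these points.

```c
#include <string.h>

/* Hypothetical sketch, not the function distributed with FAM.
   Returns 1 if `name` starts with `prefix` and `text` (buffered lines
   separated by '\n') has a line that begins with `start` and contains
   `word` as a substring; returns 0 otherwise. */
int fam_match(const char *name, const char *text,
              const char *prefix, const char *start, const char *word)
{
    size_t plen = strlen(prefix);
    size_t slen = strlen(start);
    size_t wlen = strlen(word);
    const char *line = text;

    if (strncmp(name, prefix, plen) != 0)
        return 0;

    while (*line != '\0') {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);
        size_t i;

        if (len >= slen && strncmp(line, start, slen) == 0) {
            /* Look for the word inside this line only. */
            for (i = 0; i + wlen <= len; i++)
                if (strncmp(line + i, word, wlen) == 0)
                    return 1;
        }
        if (end == NULL)
            break;
        line = end + 1;
    }
    return 0;
}
```

With these assumptions, a file named "comp.sys.amiga.42" whose buffered
header has the line "Subject: FAM 1.1 released" would match the example
given in item Six.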
All these operations are available to ARexx programs via an ADDRESS
instruction and OPTIONS RESULTS. Multiple copies may run concurrently
as the ARexx port and directory name (as well as other parameters)
are specified on the command line. Without ARexx, this program is useless.
Its original intent is to serve as support for applications written
mostly in ARexx. The source zoo contains the MinRexx and RexxGlue source
for use with Lattice 5.04 (slightly enhanced to handle non-Rexx messages).
-------------------------------------------------------------------------
TO DO:
Attempt to speed up line-counting by bypassing strchr(), which takes
about 75% of the time on test cases.
Try adding MEMF_LARGEST to malloc of buffer for entire file in an
attempt to reduce memory fragmentation during rescan.
Add control-C checking: a control-C will abort a rescan and execute
an EXPUNGE NOW
For NEWFILE, allow an option of either base 10 or base 36.
36**2=    1,296
36**3=   46,656
36**4=1,679,616
Add an optional parameter that specifies one of four rescan modes:
0) the default. rescan always counts everything.
1) rescan scans all names but leaves linecounts as -1.
When there are no pending REXX messages, find the next file with
no line count and scan it. Stop after one scan of the list.
Check for REXX messages after each file scanned.
2) rescan scans all names but leaves linecounts as -1.
When GETINFO is called for a file with no linecount then scan
the file before returning the information if possible.
3) like (2) except that the information is discarded if not used
within a "reasonable" time or if memory gets short or ...
Note that these rescan options will cause a problem for current FAMgrep
modules. There will need to be returns from FAMgrep saying "rescan it,
and then I'll look again."
Have the option of saving the buffer out to a file for quick startup?
Or is this just silly? If done, toss file if older than directory.
Alternately, reload the file buffer and then rescan, updating
entries with different dates.
-------------------------------------------------------------------------
When started, the parameters to the program must be as follows:
argv[0] - program name
argv[1] - ARexx host port name to use
argv[2] - Full path to directory
argv[3] - Executable to load for GREP command
argv[4] - string of extensions not to examine the contents of. If
omitted, no files will be opened during rescans.
argv[5] - string that terminates line buffering. If omitted, contents
of files will not be buffered.
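As a rough sketch of how these parameters might be collected, the C
fragment below copies them into a structure and records the two optional
ones as NULL when omitted. The structure, its field names, and the
function fam_parse_args are hypothetical and are not taken from the FAM
source; treating argv[1] through argv[3] as required is likewise an
assumption based on the list above.

```c
#include <stddef.h>

/* Hypothetical container for the start-up parameters. */
struct fam_args {
    const char *port;       /* argv[1]: ARexx host port name           */
    const char *dir;        /* argv[2]: full path to directory         */
    const char *grep_exe;   /* argv[3]: executable loaded for GREP     */
    const char *bin_extens; /* argv[4]: NULL means never open files    */
    const char *buf_end;    /* argv[5]: NULL means never buffer lines  */
};

/* Returns 0 on success, -1 if a required argument is missing
   (assumption: the first three parameters are mandatory). */
int fam_parse_args(int argc, char **argv, struct fam_args *out)
{
    if (argc < 4)
        return -1;
    out->port       = argv[1];
    out->dir        = argv[2];
    out->grep_exe   = argv[3];
    out->bin_extens = (argc > 4) ? argv[4] : NULL;
    out->buf_end    = (argc > 5) ? argv[5] : NULL;
    return 0;
}
```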
It is assumed that you do not wish to recurse into lower-level directories.
This restriction may be lifted in a future version if I need it.
If you define CBACK and link with CBACK.O, this will run totally in the
background once the initial startup message has been printed; do not use RUN.
In a binary distribution, this form of the executable would be called
FAM-CBACK.
If you define CRES and link with CRES.O, this will be residentable. In a
binary distribution, this would be called FAM-CRES.
If you define LMS then this will use the ASDG Low Memory Server if
available. However, the ASDG LMS has been reported to contain bugs. If the
lowmem.library is not around, the low memory checking is disabled. Note,
however, that running out of memory will not crash this program.
Currently, the author is using this program as one component of DIBBS,
Darren's Innovative Bulletin Board Server. This bulletin board, soon to be
released as shareware, is a multi-user message handling system written
mostly in ARexx. Another possible use is an ARexx-driven relational
database system with each relation being stored in a separate file. The
attribute names could be stored at the start of each relation, the comment
could store the name of the relation as the user sees it, and the line
count would supply information on the number of tuples in the relation. I
suspect that there are several more applications that could benefit from
this host.
-------------------------------------------------------------------------
The commands that this host accepts are as follows:
VERSION
Returns a string indicating what version of the program this is.
OPEN
Causes the directory to be scanned into memory if not already
there. This list is herein known as the ScanList. Each OPEN must be
balanced by a CLOSE.
CLOSE [NOW]
Causes the ScanList to be discarded if there have been more CLOSEs
than OPENs and CLEAR has been issued. The optional NOW parameter
causes enough CLOSEs to be issued to balance all outstanding OPENs.
CLEAR
Causes the ScanList to be discarded at the next time that there
have been more CLOSEs than OPENs.
EXPUNGE [NOW]
Causes the program to exit if there have been more CLOSEs than
OPENs. If there are still outstanding OPENs, causes the program to
exit when it receives the final CLOSE command. Note that this will
take down the ARexx port if and when the directory is closed. With
the optional keyword "NOW" it will discard the ScanList and then
expunge, causing errors to be returned to all subsequent callers.
There is currently no way to undo the effect of an EXPUNGE command
once issued.
GETIOERR
Returns in result string the decimal representation of the IoErr()
value. Handy after OPEN or RESCAN or RESCAN1 returns RC > 0. Note
that IoErr() of 232 is "NO MORE ENTRIES IN DIRECTORY" and is a
normal return code from ExNext().
GETDIRNAME
Returns in result the directory path passed as argv[2].
GETBINEXTENS
Returns in result the string of extensions passed in argv[4].
Returns RC=1 if no such argument was given.
GETBUFENDLINE
Returns in result the string that terminates line buffering as
passed in argv[5]. Returns RC=1 if no such argument was given.
GETNAMECOUNT
Returns in result the number of names in the list. This will be
zero (without error) if the list will need rescanning.
GETNAMES [pad]
Returns all names, separated by the indicated pad character
(default is space), sorted alphabetically, that are in the
ScanList. If finer distinctions are needed, see GREP.
GETINFO name
Returns the information for the indicated name in the ScanList. The
information is formatted as
DIR|FILE PROTECTIONS SIZEBYTES SIZELINES DATE TIME LOCK NAME NOTE\nCONTENTS
where DIR|FILE is directory or file.
PROTECTIONS are HSPARWED in some order for 1-bit protections.
The string starts with a '-' and contains one character for
each bit SET. Note that the interpretation of some bits is
funny (on AmigaDOS, for instance, a set R, W, E, or D bit
means the operation is denied).
SIZEBYTES is the size of the file in bytes.
SIZELINES is the size of the file in lines or -1 if file could not
be read or -2 if the file has a GETBINEXTENS extension.
DATE is the last change date of the file.
TIME is the last change time in seconds since midnight.
LOCK is N for NEWFILE, - for available, and L for locked.
NAME is the file name.
NOTE is the comment on the file.
CONTENTS is the buffered contents (if any) of the file. Each line
of contents (including the last) ends with a linefeed. If there
are no contents then the NOTE field is not ended with a
newline.
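A client written in C could split the first line of a GETINFO reply
along the lines of the sketch below. Several points are assumptions, not
facts from this document: that DIR|FILE appears as the literal word "DIR"
or "FILE", that DATE is a single token without spaces, and that NAME
contains no spaces. The structure, buffer sizes, and the name
fam_parse_info are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the fields of one GETINFO reply. */
struct fam_info {
    char kind[8];     /* "DIR" or "FILE" (assumed spelling)        */
    char prot[16];    /* protection string, starts with '-'        */
    long size_bytes;  /* SIZEBYTES                                 */
    long size_lines;  /* SIZELINES, may be -1 or -2                */
    char date[32];    /* DATE, assumed to be one token             */
    long time_secs;   /* TIME, seconds since midnight              */
    char lock;        /* 'N', '-', or 'L'                          */
    char name[108];   /* NAME, assumed to contain no spaces        */
    char note[80];    /* NOTE, rest of the first line              */
};

/* Returns 0 on success, -1 if the first line cannot be parsed. */
int fam_parse_info(const char *reply, struct fam_info *out)
{
    int used = 0;
    const char *p;
    size_t len;
    int n = sscanf(reply, "%7s %15s %ld %ld %31s %ld %c %107s%n",
                   out->kind, out->prot, &out->size_bytes,
                   &out->size_lines, out->date, &out->time_secs,
                   &out->lock, out->name, &used);

    if (n != 8)
        return -1;

    /* NOTE runs from after NAME to the end of the first line. */
    p = reply + used;
    if (*p == ' ')
        p++;
    len = strcspn(p, "\n");
    if (len >= sizeof out->note)
        len = sizeof out->note - 1;
    memcpy(out->note, p, len);
    out->note[len] = '\0';
    return 0;
}
```

Anything after the first linefeed in the reply is the buffered CONTENTS
and is left untouched by this sketch.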
If the file ends in one of the extensions listed in argv[4] then
the file is assumed to be some sort of binary file and it is not
opened during a rescan. This means that its lines are not counted
and that none of the file's contents are buffered. (Argv[4] should
be all the extensions with the period. For example, ".zoo.arc.exe")
If the extension is not one of the GETBINEXTENS then it is read
into memory (if readable) and then the number of newlines is
counted and stored in the SIZELINES. If the GETBUFENDLINE string is
found at the beginning of a line, then all lines of the file up to
but not including the line beginning with the GETBUFENDLINE string
are buffered and returned by GETINFO. If the contents information is
not present in spite of having a non-negative SIZELINES then this
indicates that the file has an extension not in GETBINEXTENS but
does not contain a line starting with the GETBUFENDLINE string.
If memory is exhausted attempting to store the contents then no
contents are stored.
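The rescan rules just described (skip files with a listed extension,
count newlines, buffer everything before the first line that begins with
the GETBUFENDLINE string) might look roughly like this in C. This is a
sketch only: the function names are hypothetical, the extension
comparison is assumed to be case-sensitive, and the file is assumed to be
already in memory as a NUL-terminated string.

```c
#include <string.h>

/* Returns 1 if `name` ends with one of the extensions packed in
   `extens` (for example ".zoo.arc.exe"), else 0. A NULL `extens`
   yields 0; the caller decides what an omitted argv[4] means. */
int fam_is_binary(const char *name, const char *extens)
{
    const char *dot = strrchr(name, '.');
    const char *p;
    size_t elen;

    if (dot == NULL || extens == NULL)
        return 0;
    elen = strlen(dot);
    for (p = extens; (p = strchr(p, '.')) != NULL; p++) {
        const char *next = strchr(p + 1, '.');
        size_t len = next ? (size_t)(next - p) : strlen(p);
        if (len == elen && strncmp(p, dot, len) == 0)
            return 1;
    }
    return 0;
}

/* Counts the newlines in `buf`. If some line begins with `endline`,
   *header_len receives the number of bytes before the first such
   line; otherwise it receives 0 (nothing to buffer). */
long fam_scan(const char *buf, const char *endline, size_t *header_len)
{
    size_t elen = endline ? strlen(endline) : 0;
    const char *line = buf;
    long lines = 0;
    int found = 0;

    *header_len = 0;
    while (*line != '\0') {
        const char *nl = strchr(line, '\n');
        if (!found && endline != NULL &&
            strncmp(line, endline, elen) == 0) {
            *header_len = (size_t)(line - buf);
            found = 1;
        }
        if (nl == NULL)
            break;
        lines++;
        line = nl + 1;
    }
    return lines;
}
```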
RESCAN [fpat]
Rescans the directory and rebuilds the ScanList for the files
matching the indicated pattern. Note that file names placed in the
ScanList by the NEWFILE command (and not yet subjected to RESCAN1)
are not modified by this command.
RESCAN1 fname
Rescans the indicated single file and rebuilds the control
information for just that file. This is more efficient than RESCAN.
It also allows a file name created with NEWFILE to become a regular
file, subject to rescanning and all. It also allows a file that has
been deleted from the directory to be removed from the ScanList
without a complete RESCAN.
GREP[pad][pat]
Looks through all files, handing each file name structure and 'pat'
to the user's function to determine a match. Returns list of file
names for which user's function returned true. See below for
information on the user's function. The first character after the
keyword GREP (i.e., the fifth character of the command) can be
anything, not just a space. If it is not a space, it will be used
as the separation character.
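The way the fifth character doubles as the separator can be shown with
a small C sketch. The function name fam_split_grep is hypothetical, and
the sketch assumes the keyword arrives in upper case and that a bare
"GREP" means a space separator with an empty pattern.

```c
#include <string.h>

/* Splits a GREP command into separator and pattern.
   Returns 0 and sets *pad and *pat on success, -1 if the command
   does not begin with "GREP". */
int fam_split_grep(const char *cmd, char *pad, const char **pat)
{
    if (strncmp(cmd, "GREP", 4) != 0)
        return -1;
    if (cmd[4] == '\0') {
        /* Assumption: nothing after the keyword at all. */
        *pad = ' ';
        *pat = cmd + 4;
        return 0;
    }
    *pad = cmd[4];   /* a space here simply selects the default */
    *pat = cmd + 5;  /* everything after the fifth character    */
    return 0;
}
```

Under these assumptions, "GREP,comp.sys.amiga" asks for names separated
by commas, while "GREP comp.sys.amiga" asks for names separated by
spaces.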
NEWFILE prefix
Finds in the ScanList the last (alphabetically) file that starts
with the given prefix, interprets the remainder of that file's name
as an integer (meaning that the file's name must end in a string of
digits), adds one to the integer (zero filling on the left to make
the name as long as the alphabetically previous name), and adds
that name to the ScanList, and returns the new name. It does all
this atomically with respect to handling ARexx commands. It does
not actually create the file. The file is flagged internally in
such a way as to survive RESCAN calls even if the file does not
actually exist. If there is no file with the indicated prefix,
RC=20 is returned. If there is an overflow (i.e., there is a file
whose name is the prefix followed by some number of '9's), RC=10 is
returned. In either error case, no name is added to the buffered
directory list. This command is used to create a new file in a
directory without possibility of overlapping use of file names.
Note that once the file is written and closed, the ARexx program
should call RESCAN1 to update the directory information for real.
If the user decides to delete the file, calling RESCAN1 when the
file does not exist will cause it to be removed from the internal
list.
NEWFILE10 prefix
A synonym for NEWFILE.
NEWFILE36 prefix
This functions identically to NEWFILE except that all digits are
assumed to be in base 36. That is, the digits
0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ
are used instead of the normal ten. Notice that these are uppercase
letters and that the letters come after the numbers. This is done
to allow denser numbers in references to this file. NOT YET
IMPLEMENTED!!!
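The increment step shared by NEWFILE, NEWFILE10, and the proposed
NEWFILE36 can be sketched in C as below. This is not the FAM
implementation: fam_next_name is a hypothetical name, it covers only the
"add one, keeping the width" step (not the search for the alphabetically
last file), and the -1 return for malformed input is an assumption. The
return value 10 mirrors the overflow RC described above.

```c
#include <string.h>

/* Adds one, in place, to the digits of `name` that follow the first
   `prefix_len` characters, using base 10 or base 36 and keeping the
   same number of digits.
   Returns 0 on success, 10 on overflow (every digit is already the
   largest one), -1 if there are no digits or one is invalid. */
int fam_next_name(char *name, size_t prefix_len, int base)
{
    static const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    size_t len = strlen(name);
    size_t i;
    int all_max = 1;

    if ((base != 10 && base != 36) || len <= prefix_len)
        return -1;

    /* Validate first so the name is never left half-modified. */
    for (i = prefix_len; i < len; i++) {
        if (memchr(digits, name[i], (size_t)base) == NULL)
            return -1;
        if (name[i] != digits[base - 1])
            all_max = 0;
    }
    if (all_max)
        return 10;

    /* Add one from the right, carrying while a digit wraps to '0'. */
    for (i = len; i > prefix_len; i--) {
        const char *d = memchr(digits, name[i - 1], (size_t)base);
        int v = (int)(d - digits) + 1;
        if (v < base) {
            name[i - 1] = digits[v];
            break;
        }
        name[i - 1] = '0';
    }
    return 0;
}
```

For example, with a one-character prefix, "X03" becomes "X04" in base 10
and "X0Z" becomes "X10" in base 36, while "X99" in base 10 is reported as
an overflow and left unchanged.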
LOCKFILE prev new name
If name is in the ScanList, this looks at its LOCK status (as
returned by GETINFO). If the status matches prev (1 character) then
the status is changed to new and RC=0. Otherwise, RC>0. This is an
atomic operation with respect to other ARexx messages.
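The test-and-set behaviour of LOCKFILE amounts to a compare-and-swap on
the one-character LOCK field. A minimal C sketch follows; the structure
and the name fam_lockfile are hypothetical, and returning exactly 1 on
failure is an assumption (the text above only promises RC>0). The sketch
by itself is not atomic; in FAM the atomicity is with respect to other
ARexx messages, as stated above.

```c
#include <stddef.h>

/* Hypothetical in-memory entry; only the lock character matters. */
struct fam_entry {
    const char *name;
    char lock;          /* 'N', '-', or 'L' as reported by GETINFO */
};

/* Changes the lock state from `prev` to `new_state`.
   Returns 0 if the state matched and was changed, 1 otherwise. */
int fam_lockfile(struct fam_entry *e, char prev, char new_state)
{
    if (e == NULL || e->lock != prev)
        return 1;
    e->lock = new_state;
    return 0;
}
```

A client would take a lock with prev '-' and new 'L', and release it
again with prev 'L' and new '-'. A second client attempting the same
acquisition in between gets a nonzero result and knows the file is busy.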
----------------------------------------------------------------------------
The user's function called by GREP must be stored in the executable file
named in the command line. At startup, this file is LoadSeg()ed. When the
GREP command is called, this function will be passed a long pointer to
a character (mpat) followed by a pointer to a ScanListNode. It is
expected to return a boolean (0 or 1) in D0. Hence, from C we call
    ((long (*)(struct ScanListNode *, char *))(GrepSeg))(ScanList, p);
and use the result. The function must therefore have its entry point at
the start of the file, take its arguments from the stack, return its
result in D0, and be serially reusable.